Overview

Dataset statistics

Number of variables: 12
Number of observations: 5693
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 533.8 KiB
Average record size in memory: 96.0 B
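
The two memory figures can be cross-checked with simple arithmetic. The sketch below assumes that each of the 12 numeric columns stores 8 bytes per value (int64 or float64) and that the frame carries a 128-byte index. Neither assumption is stated in the report, but together they reproduce the figures shown.

```python
# Cross-check of the memory figures in the overview.
# Assumptions (not stated in the report): 8 bytes per cell, 128-byte index.
N_ROWS = 5693
N_COLS = 12
BYTES_PER_CELL = 8   # assumed int64/float64 storage
INDEX_BYTES = 128    # assumed index overhead

total_bytes = N_ROWS * N_COLS * BYTES_PER_CELL + INDEX_BYTES
total_kib = total_bytes / 1024
avg_record_bytes = total_bytes / N_ROWS

print(f"Total size in memory: {total_kib:.1f} KiB")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under the same assumptions a single column takes (5693 * 8 + 128) / 1024, about 44.6 KiB, which matches the per-variable memory size listed in the report.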

Variable types

Numeric: 12

Alerts

gross_revenue is highly correlated with qty_invoice_no and 3 other fields [High correlation]
recency_days is highly correlated with qty_invoice_no [High correlation]
qty_invoice_no is highly correlated with gross_revenue and 5 other fields [High correlation]
qty_items is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 4 other fields [High correlation]
frequency is highly correlated with qty_invoice_no and 1 other field [High correlation]
qty_returns is highly correlated with qty_invoice_no [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with qty_products and 1 other field [High correlation]
gross_revenue is highly correlated with qty_invoice_no and 1 other field [High correlation]
qty_invoice_no is highly correlated with gross_revenue and 2 other fields [High correlation]
qty_items is highly correlated with gross_revenue and 1 other field [High correlation]
qty_products is highly correlated with qty_invoice_no [High correlation]
gross_revenue is highly correlated with qty_invoice_no and 3 other fields [High correlation]
qty_invoice_no is highly correlated with gross_revenue and 2 other fields [High correlation]
qty_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 2 other fields [High correlation]
frequency is highly correlated with qty_invoice_no [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qty_products [High correlation]
df_index is highly correlated with customer_id and 1 other field [High correlation]
customer_id is highly correlated with df_index and 1 other field [High correlation]
gross_revenue is highly correlated with qty_invoice_no and 3 other fields [High correlation]
recency_days is highly correlated with df_index and 1 other field [High correlation]
qty_invoice_no is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with qty_returns and 1 other field [High correlation]
qty_returns is highly correlated with gross_revenue and 5 other fields [High correlation]
avg_basket_size is highly correlated with avg_ticket and 2 other fields [High correlation]
avg_unique_basket_size is highly correlated with avg_basket_size [High correlation]
gross_revenue is highly skewed (γ1 = 23.14756188) [Skewed]
qty_items is highly skewed (γ1 = 25.0992138) [Skewed]
avg_ticket is highly skewed (γ1 = 24.95802561) [Skewed]
qty_returns is highly skewed (γ1 = 30.34012229) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
customer_id has unique values [Unique]
qty_returns has 4191 (73.6%) zeros [Zeros]
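
The Skewed and Zeros alerts rest on two simple quantities: the skewness coefficient γ1 and the share of zero values. The following is a minimal sketch of both, not the profiler's own implementation. It uses the population formula γ1 = m3 / m2^1.5 without bias correction, so results can differ slightly from library values on small samples. `SKEW_THRESHOLD` and `ZERO_THRESHOLD` are hypothetical cut-offs chosen for illustration, not the report's configured values.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


SKEW_THRESHOLD = 2.0    # hypothetical cut-off for this toy example
ZERO_THRESHOLD = 0.10   # hypothetical cut-off for this toy example

# Toy column: mostly zeros with one large value, loosely like qty_returns.
sample = [0, 0, 0, 0, 0, 0, 0, 0, 1, 100]

alerts = []
if abs(skewness(sample)) > SKEW_THRESHOLD:
    alerts.append("Skewed")
if zero_share(sample) > ZERO_THRESHOLD:
    alerts.append("Zeros")
print(alerts)
```

A long right tail pushes γ1 far above zero, which is why the revenue, item-count, and return columns are flagged while the near-symmetric df_index (skewness close to 0) is not.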

Reproduction

Analysis started: 2022-04-12 00:51:56.255249
Analysis finished: 2022-04-12 00:52:10.985659
Duration: 14.73 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 5693
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2894.801159
Minimum: 0
Maximum: 5783
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 289.6
Q1: 1454
Median: 2897
Q3: 4339
95-th percentile: 5492.4
Maximum: 5783
Range: 5783
Interquartile range (IQR): 2885

Descriptive statistics

Standard deviation: 1668.359199
Coefficient of variation (CV): 0.5763294635
Kurtosis: -1.196239465
Mean: 2894.801159
Median Absolute Deviation (MAD): 1443
Skewness: -0.003575526764
Sum: 16480103
Variance: 2783422.417
Monotonicity: Strictly increasing
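
Several of the descriptive statistics are derived from one another, which gives a quick consistency check on the report. The sketch below recomputes the coefficient of variation, variance, range, IQR, and mean from the df_index figures above. The variable names are illustrative, and the observation count 5693 comes from the overview.

```python
import math

# Figures copied from the df_index statistics in this report.
n = 5693
total = 16480103
mean = 2894.801159
std = 1668.359199
variance = 2783422.417
minimum, maximum = 0, 5783
q1, q3 = 1454, 4339

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV = {cv:.10f}")
print(f"Range = {value_range}, IQR = {iqr}")
print("variance == std**2:", math.isclose(std ** 2, variance, rel_tol=1e-6))
print("mean == sum / n:", math.isclose(total / n, mean, rel_tol=1e-6))
```

The same relations hold for every variable in the report, so a mismatch in any of them would point to a transcription or extraction error rather than a property of the data.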
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
0                    1      < 0.1%
3888                 1      < 0.1%
3864                 1      < 0.1%
3863                 1      < 0.1%
3862                 1      < 0.1%
3861                 1      < 0.1%
3860                 1      < 0.1%
3859                 1      < 0.1%
3858                 1      < 0.1%
3857                 1      < 0.1%
Other values (5683)  5683   99.8%

Smallest values
Value                Count  Frequency (%)
0                    1      < 0.1%
1                    1      < 0.1%
2                    1      < 0.1%
3                    1      < 0.1%
4                    1      < 0.1%
5                    1      < 0.1%
6                    1      < 0.1%
7                    1      < 0.1%
8                    1      < 0.1%
9                    1      < 0.1%

Largest values
Value                Count  Frequency (%)
5783                 1      < 0.1%
5782                 1      < 0.1%
5781                 1      < 0.1%
5780                 1      < 0.1%
5779                 1      < 0.1%
5778                 1      < 0.1%
5777                 1      < 0.1%
5776                 1      < 0.1%
5775                 1      < 0.1%
5774                 1      < 0.1%

customer_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 5693
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16601.61022
Minimum: 12347
Maximum: 22709
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12700.6
Q1: 14289
Median: 16229
Q3: 18211
95-th percentile: 21732.8
Maximum: 22709
Range: 10362
Interquartile range (IQR): 3922

Descriptive statistics

Standard deviation: 2808.098224
Coefficient of variation (CV): 0.1691461362
Kurtosis: -0.8214808648
Mean: 16601.61022
Median Absolute Deviation (MAD): 1962
Skewness: 0.4410508271
Sum: 94512967
Variance: 7885415.636
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
17850                1      < 0.1%
21111                1      < 0.1%
16498                1      < 0.1%
13745                1      < 0.1%
15584                1      < 0.1%
21089                1      < 0.1%
21088                1      < 0.1%
21087                1      < 0.1%
21086                1      < 0.1%
15578                1      < 0.1%
Other values (5683)  5683   99.8%

Smallest values
Value                Count  Frequency (%)
12347                1      < 0.1%
12348                1      < 0.1%
12349                1      < 0.1%
12350                1      < 0.1%
12352                1      < 0.1%
12353                1      < 0.1%
12354                1      < 0.1%
12355                1      < 0.1%
12356                1      < 0.1%
12357                1      < 0.1%

Largest values
Value                Count  Frequency (%)
22709                1      < 0.1%
22708                1      < 0.1%
22707                1      < 0.1%
22706                1      < 0.1%
22705                1      < 0.1%
22704                1      < 0.1%
22700                1      < 0.1%
22699                1      < 0.1%
22696                1      < 0.1%
22695                1      < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5447
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1754.357676
Minimum: 0.42
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 13.128
Q1: 236.09
Median: 612.78
Q3: 1569.11
95-th percentile: 5260.23
Maximum: 279138.02
Range: 279137.6
Interquartile range (IQR): 1333.02

Descriptive statistics

Standard deviation: 7500.337959
Coefficient of variation (CV): 4.275261574
Kurtosis: 704.3157538
Mean: 1754.357676
Median Absolute Deviation (MAD): 478.74
Skewness: 23.14756188
Sum: 9987558.25
Variance: 56255069.5
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
7.95                 9      0.2%
1.25                 8      0.1%
2.95                 8      0.1%
4.95                 8      0.1%
1.65                 7      0.1%
12.75                7      0.1%
3.75                 7      0.1%
5.95                 6      0.1%
4.25                 6      0.1%
7.5                  6      0.1%
Other values (5437)  5621   98.7%

Smallest values
Value                Count  Frequency (%)
0.42                 1      < 0.1%
0.65                 1      < 0.1%
0.79                 1      < 0.1%
0.84                 4      0.1%
0.85                 3      0.1%
1.07                 1      < 0.1%
1.25                 8      0.1%
1.44                 1      < 0.1%
1.65                 7      0.1%
1.69                 1      < 0.1%

Largest values
Value                Count  Frequency (%)
279138.02            1      < 0.1%
259657.3             1      < 0.1%
194550.79            1      < 0.1%
140450.72            1      < 0.1%
124564.53            1      < 0.1%
117379.63            1      < 0.1%
91062.38             1      < 0.1%
72882.09             1      < 0.1%
66653.56             1      < 0.1%
65039.62             1      < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 304
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 116.8796768
Minimum: 0
Maximum: 373
Zeros: 37
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 23
Median: 71
Q3: 200
95-th percentile: 338
Maximum: 373
Range: 373
Interquartile range (IQR): 177

Descriptive statistics

Standard deviation: 111.6285712
Coefficient of variation (CV): 0.9550725521
Kurtosis: -0.641033991
Mean: 116.8796768
Median Absolute Deviation (MAD): 61
Skewness: 0.8151071897
Sum: 665396
Variance: 12460.93791
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    110    1.9%
4                    105    1.8%
3                    98     1.7%
2                    92     1.6%
10                   86     1.5%
8                    82     1.4%
17                   79     1.4%
9                    79     1.4%
7                    78     1.4%
15                   66     1.2%
Other values (294)   4818   84.6%

Smallest values
Value                Count  Frequency (%)
0                    37     0.6%
1                    110    1.9%
2                    92     1.6%
3                    98     1.7%
4                    105    1.8%
5                    52     0.9%
7                    78     1.4%
8                    82     1.4%
9                    79     1.4%
10                   86     1.5%

Largest values
Value                Count  Frequency (%)
373                  23     0.4%
372                  23     0.4%
371                  17     0.3%
369                  4      0.1%
368                  13     0.2%
367                  16     0.3%
366                  15     0.3%
365                  19     0.3%
364                  11     0.2%
362                  7      0.1%

qty_invoice_no
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 56
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.471807483
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 11
Maximum: 206
Range: 205
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.814409585
Coefficient of variation (CV): 1.962784405
Kurtosis: 301.9942725
Mean: 3.471807483
Median Absolute Deviation (MAD): 0
Skewness: 13.19072233
Sum: 19765
Variance: 46.43617799
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    2870   50.4%
2                    825    14.5%
3                    502    8.8%
4                    394    6.9%
5                    237    4.2%
6                    173    3.0%
7                    138    2.4%
8                    98     1.7%
9                    69     1.2%
10                   55     1.0%
Other values (46)    332    5.8%

Smallest values
Value                Count  Frequency (%)
1                    2870   50.4%
2                    825    14.5%
3                    502    8.8%
4                    394    6.9%
5                    237    4.2%
6                    173    3.0%
7                    138    2.4%
8                    98     1.7%
9                    69     1.2%
10                   55     1.0%

Largest values
Value                Count  Frequency (%)
206                  1      < 0.1%
199                  1      < 0.1%
124                  1      < 0.1%
97                   1      < 0.1%
91                   2      < 0.1%
86                   1      < 0.1%
72                   1      < 0.1%
62                   2      < 0.1%
60                   1      < 0.1%
57                   1      < 0.1%

qty_items
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1839
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 951.7071843
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 106
Median: 317
Q3: 804
95-th percentile: 2925.2
Maximum: 196844
Range: 196843
Interquartile range (IQR): 698

Descriptive statistics

Standard deviation: 4189.908875
Coefficient of variation (CV): 4.402518909
Kurtosis: 942.5110255
Mean: 951.7071843
Median Absolute Deviation (MAD): 253
Skewness: 25.0992138
Sum: 5418069
Variance: 17555336.38
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    114    2.0%
2                    73     1.3%
3                    51     0.9%
4                    49     0.9%
5                    35     0.6%
6                    29     0.5%
12                   25     0.4%
88                   22     0.4%
72                   21     0.4%
7                    20     0.4%
Other values (1829)  5254   92.3%

Smallest values
Value                Count  Frequency (%)
1                    114    2.0%
2                    73     1.3%
3                    51     0.9%
4                    49     0.9%
5                    35     0.6%
6                    29     0.5%
7                    20     0.4%
8                    18     0.3%
9                    7      0.1%
10                   17     0.3%

Largest values
Value                Count  Frequency (%)
196844               1      < 0.1%
80263                1      < 0.1%
77373                1      < 0.1%
69993                1      < 0.1%
64549                1      < 0.1%
64124                1      < 0.1%
63312                1      < 0.1%
58343                1      < 0.1%
57885                1      < 0.1%
50255                1      < 0.1%

qty_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 529
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.64131389
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 14
Median: 41
Q3: 106
95-th percentile: 332.4
Maximum: 7838
Range: 7837
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 210.6087384
Coefficient of variation (CV): 2.273378146
Kurtosis: 510.1702143
Mean: 92.64131389
Median Absolute Deviation (MAD): 33
Skewness: 17.75158488
Sum: 527407
Variance: 44356.04068
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    255    4.5%
2                    149    2.6%
3                    107    1.9%
10                   101    1.8%
6                    99     1.7%
9                    92     1.6%
5                    91     1.6%
4                    87     1.5%
11                   83     1.5%
7                    83     1.5%
Other values (519)   4546   79.9%

Smallest values
Value                Count  Frequency (%)
1                    255    4.5%
2                    149    2.6%
3                    107    1.9%
4                    87     1.5%
5                    91     1.6%
6                    99     1.7%
7                    83     1.5%
8                    81     1.4%
9                    92     1.6%
10                   101    1.8%

Largest values
Value                Count  Frequency (%)
7838                 1      < 0.1%
5673                 1      < 0.1%
5095                 1      < 0.1%
4580                 1      < 0.1%
2698                 1      < 0.1%
2379                 1      < 0.1%
2060                 1      < 0.1%
1818                 1      < 0.1%
1673                 1      < 0.1%
1637                 1      < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5500
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.86799676
Minimum: 0.42
Maximum: 4453.43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 3.459660952
Q1: 7.95
Median: 15.85181818
Q3: 21.97516949
95-th percentile: 75.86392342
Maximum: 4453.43
Range: 4453.01
Interquartile range (IQR): 14.02516949

Descriptive statistics

Standard deviation: 115.5612063
Coefficient of variation (CV): 4.003090594
Kurtosis: 791.1895729
Mean: 28.86799676
Median Absolute Deviation (MAD): 7.493397129
Skewness: 24.95802561
Sum: 164345.5056
Variance: 13354.3924
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
3.75                 11     0.2%
4.95                 10     0.2%
2.95                 9      0.2%
1.25                 9      0.2%
7.95                 8      0.1%
1.65                 7      0.1%
8.25                 7      0.1%
12.75                7      0.1%
3.35                 6      0.1%
15                   6      0.1%
Other values (5490)  5613   98.6%

Smallest values
Value                Count  Frequency (%)
0.42                 3      0.1%
0.535                1      < 0.1%
0.65                 1      < 0.1%
0.79                 1      < 0.1%
0.8371428571         1      < 0.1%
0.84                 2      < 0.1%
0.85                 3      0.1%
1.002222222          1      < 0.1%
1.02                 1      < 0.1%
1.03875              1      < 0.1%

Largest values
Value                Count  Frequency (%)
4453.43              1      < 0.1%
3861                 1      < 0.1%
3202.92              1      < 0.1%
3096                 1      < 0.1%
1687.2               1      < 0.1%
1377.077778          1      < 0.1%
1001.2               1      < 0.1%
952.9875             1      < 0.1%
931.5                1      < 0.1%
872.13               1      < 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1226
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5467863142
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.01104159896
Q1: 0.02492211838
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.9750778816

Descriptive statistics

Standard deviation: 0.5493600313
Coefficient of variation (CV): 1.004706989
Kurtosis: 140.0664152
Mean: 0.5467863142
Median Absolute Deviation (MAD): 0
Skewness: 4.871258314
Sum: 3112.854487
Variance: 0.301796444
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    2878   50.6%
2                    47     0.8%
0.0625               18     0.3%
0.02777777778        17     0.3%
0.02380952381        16     0.3%
0.08333333333        15     0.3%
0.09090909091        15     0.3%
0.03448275862        14     0.2%
0.02941176471        14     0.2%
0.03571428571        13     0.2%
Other values (1216)  2646   46.5%

Smallest values
Value                Count  Frequency (%)
0.005449591281       1      < 0.1%
0.005464480874       1      < 0.1%
0.005479452055       1      < 0.1%
0.005494505495       1      < 0.1%
0.005586592179       2      < 0.1%
0.005602240896       1      < 0.1%
0.005617977528       2      < 0.1%
0.00566572238        1      < 0.1%
0.005681818182       2      < 0.1%
0.005698005698       3      0.1%

Largest values
Value                Count  Frequency (%)
17                   1      < 0.1%
4                    1      < 0.1%
3                    4      0.1%
2                    47     0.8%
1.142857143          1      < 0.1%
1                    2878   50.6%
0.75                 1      < 0.1%
0.6666666667         3      0.1%
0.550802139          1      < 0.1%
0.5335120643         1      < 0.1%

qty_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 213
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.23256631
Minimum: 0
Maximum: 9014
Zeros: 4191
Zeros (%): 73.6%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 38
Maximum: 9014
Range: 9014
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 204.9662983
Coefficient of variation (CV): 11.24176898
Kurtosis: 1138.689658
Mean: 18.23256631
Median Absolute Deviation (MAD): 0
Skewness: 30.34012229
Sum: 103798
Variance: 42011.18343
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
0                    4191   73.6%
1                    169    3.0%
2                    150    2.6%
3                    105    1.8%
4                    89     1.6%
6                    78     1.4%
5                    61     1.1%
12                   52     0.9%
7                    44     0.8%
8                    43     0.8%
Other values (203)   711    12.5%

Smallest values
Value                Count  Frequency (%)
0                    4191   73.6%
1                    169    3.0%
2                    150    2.6%
3                    105    1.8%
4                    89     1.6%
5                    61     1.1%
6                    78     1.4%
7                    44     0.8%
8                    43     0.8%
9                    41     0.7%

Largest values
Value                Count  Frequency (%)
9014                 1      < 0.1%
8004                 1      < 0.1%
4427                 1      < 0.1%
3768                 1      < 0.1%
3332                 1      < 0.1%
2878                 1      < 0.1%
2022                 1      < 0.1%
2012                 1      < 0.1%
1776                 1      < 0.1%
1594                 1      < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2367
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 248.1620234
Minimum: 1
Maximum: 14149
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 75
Median: 151.6666667
Q3: 290
95-th percentile: 732.55
Maximum: 14149
Range: 14148
Interquartile range (IQR): 215

Descriptive statistics

Standard deviation: 439.5046618
Coefficient of variation (CV): 1.771039162
Kurtosis: 378.6486311
Mean: 248.1620234
Median Absolute Deviation (MAD): 96.66666667
Skewness: 14.57309198
Sum: 1412786.399
Variance: 193164.3477
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    115    2.0%
2                    72     1.3%
3                    51     0.9%
4                    49     0.9%
5                    35     0.6%
6                    29     0.5%
12                   26     0.5%
100                  22     0.4%
72                   22     0.4%
88                   21     0.4%
Other values (2357)  5251   92.2%

Smallest values
Value                Count  Frequency (%)
1                    115    2.0%
2                    72     1.3%
3                    51     0.9%
3.333333333          1      < 0.1%
4                    49     0.9%
5                    35     0.6%
5.333333333          1      < 0.1%
5.666666667          1      < 0.1%
6                    29     0.5%
6.142857143          1      < 0.1%

Largest values
Value                Count  Frequency (%)
14149                1      < 0.1%
13956                1      < 0.1%
7824                 1      < 0.1%
6009.333333          1      < 0.1%
5963                 1      < 0.1%
5197                 1      < 0.1%
4300                 1      < 0.1%
4282                 1      < 0.1%
4280                 1      < 0.1%
4136                 1      < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1171
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.26861967
Minimum: 0.2
Maximum: 1109
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 1
Q1: 7.25
Median: 15
Q3: 31
95-th percentile: 173
Maximum: 1109
Range: 1108.8
Interquartile range (IQR): 23.75

Descriptive statistics

Standard deviation: 76.8933091
Coefficient of variation (CV): 2.063218595
Kurtosis: 32.87862237
Mean: 37.26861967
Median Absolute Deviation (MAD): 9.941176471
Skewness: 5.072820295
Sum: 212170.2518
Variance: 5912.580985
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    277    4.9%
2                    161    2.8%
3                    115    2.0%
10                   105    1.8%
9                    105    1.8%
8                    103    1.8%
7                    101    1.8%
6                    101    1.8%
5                    101    1.8%
13                   97     1.7%
Other values (1161)  4427   77.8%

Smallest values
Value                Count  Frequency (%)
0.2                  1      < 0.1%
0.25                 3      0.1%
0.3333333333         6      0.1%
0.4                  1      < 0.1%
0.4090909091         1      < 0.1%
0.5                  12     0.2%
0.5454545455         1      < 0.1%
0.5555555556         1      < 0.1%
0.5714285714         1      < 0.1%
0.6176470588         1      < 0.1%

Largest values
Value                Count  Frequency (%)
1109                 1      < 0.1%
748                  1      < 0.1%
730                  1      < 0.1%
720                  1      < 0.1%
703                  1      < 0.1%
686                  1      < 0.1%
675                  1      < 0.1%
673                  1      < 0.1%
660                  1      < 0.1%
649                  1      < 0.1%

Interactions

[Figure: 144 pairwise interaction plots, one for each ordered pair of the 12 variables. Images not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
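
The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report itself runs; the helper names `average_ranks` and `spearman_rho` are hypothetical, and tied values are assumed to receive the average of their rank positions. The sample values are `qty_items` and `gross_revenue` from the first five rows of the Sample section.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = (sum((a - mean_x) ** 2 for a in rx) / n) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in ry) / n) ** 0.5
    return cov / (std_x * std_y)


# qty_items and gross_revenue for the first five sample rows.
qty_items = [1733.0, 1390.0, 5028.0, 439.0, 80.0]
gross_revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]

# Both columns order these five customers identically, so ρ is 1
# up to floating-point rounding, even though the relationship is not exactly linear.
print(spearman_rho(qty_items, gross_revenue))
```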

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
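
As a rough illustration of this formula and of the invariance property, the following plain-Python sketch computes r directly from the definition. The function name `pearson_r` is hypothetical and the population (divide-by-n) form of covariance and standard deviation is an assumption; the normalisation cancels in the ratio either way. The sample values are `qty_items` and `gross_revenue` from the first five rows of the Sample section.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = (sum((a - mean_x) ** 2 for a in x) / n) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in y) / n) ** 0.5
    return cov / (std_x * std_y)


# qty_items and gross_revenue for the first five sample rows.
qty_items = [1733.0, 1390.0, 5028.0, 439.0, 80.0]
gross_revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]

r = pearson_r(qty_items, gross_revenue)

# Shifting and rescaling one variable by a positive factor leaves r unchanged.
rescaled_items = [2.0 * v + 100.0 for v in qty_items]
r_rescaled = pearson_r(rescaled_items, gross_revenue)

# A negative scale factor flips the sign of r but not its magnitude.
negated_items = [-v for v in qty_items]
r_negated = pearson_r(negated_items, gross_revenue)

print(r, r_rescaled, r_negated)
```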

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
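
The pair-counting rule above can be sketched as follows. This is a minimal illustration of the formula as stated (often called tau-a), with a hypothetical function name `kendall_tau_a`; pairs tied in either variable are counted as neither concordant nor discordant. Library implementations frequently apply a tie correction, so their results may differ on data with ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # The sign of this product tells whether x and y order the pair the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Of the 6 pairs, 5 are concordant and 1 (the middle two observations)
# is discordant, so tau = (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```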

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

row   df_index  customer_id  gross_revenue  recency_days  qty_invoice_no  qty_items  qty_products  avg_ticket  frequency  qty_returns  avg_basket_size  avg_unique_basket_size
0            0        17850        5391.21         372.0            34.0     1733.0         297.0   18.152222  17.000000         40.0        50.970588                0.617647
1            1        13047        3232.59          56.0             9.0     1390.0         171.0   18.904035   0.028302         35.0       154.444444               11.666667
2            2        12583        6705.38           2.0            15.0     5028.0         232.0   28.902500   0.040323         50.0       335.200000                7.600000
3            3        13748         948.25          95.0             5.0      439.0          28.0   33.866071   0.017921          0.0        87.800000                4.800000
4            4        15100         876.00         333.0             3.0       80.0           3.0  292.000000   0.073171         22.0        26.666667                0.333333
5            5        15291        4623.30          25.0            14.0     2102.0         102.0   45.326471   0.040115         29.0       150.142857                4.357143
6            6        14688        5630.87           7.0            21.0     3621.0         327.0   17.219786   0.057221        399.0       172.428571                7.047619
7            7        17809        5411.91          16.0            12.0     2057.0          61.0   88.719836   0.033520         41.0       171.416667                3.833333
8            8        15311       60767.90           0.0            91.0    38194.0        2379.0   25.543464   0.243316        474.0       419.714286                6.230769
9            9        16098        2005.63          87.0             7.0      613.0          67.0   29.934776   0.024390          0.0        87.571429                4.857143

Last rows

row   df_index  customer_id  gross_revenue  recency_days  qty_invoice_no  qty_items  qty_products  avg_ticket  frequency  qty_returns  avg_basket_size  avg_unique_basket_size
5683      5774        22700        4839.42           1.0             1.0     1074.0          62.0   78.055161        1.0          0.0           1074.0                    55.0
5684      5775        13298         360.00           1.0             1.0       96.0           2.0  180.000000        1.0          0.0             96.0                     2.0
5685      5776        14569         227.39           1.0             1.0       79.0          12.0   18.949167        1.0          0.0             79.0                    10.0
5686      5777        22704          17.90           1.0             1.0       14.0           7.0    2.557143        1.0          0.0             14.0                     7.0
5687      5778        22705           3.35           1.0             1.0        2.0           2.0    1.675000        1.0          0.0              2.0                     2.0
5688      5779        22706        5699.00           1.0             1.0     1747.0         634.0    8.988959        1.0          0.0           1747.0                   634.0
5689      5780        22707        6756.06           0.0             1.0     2010.0         730.0    9.254877        1.0          0.0           2010.0                   730.0
5690      5781        22708        3217.20           0.0             1.0      654.0          59.0   54.528814        1.0          0.0            654.0                    56.0
5691      5782        22709        3950.72           0.0             1.0      731.0         217.0   18.206083        1.0          0.0            731.0                   217.0
5692      5783        12713         794.55           0.0             1.0      505.0          37.0   21.474324        1.0          0.0            505.0                    37.0